Speak, Edit, Repeat: High-Fidelity Voice Editing and Zero-Shot TTS with Cross-Attentive Mamba
arxiv.org·3d
🎚️Voice AI Systems
Show HN: I built a video-to-text tool – 10 min free daily, no signup
harku.io·5h·Discuss: Hacker News
🎚️Audio Codecs
Show HN: AI Voice AudioBook – Convert ebooks to audio with your cloned voice
zan.chat·6h·Discuss: Hacker News
🎚️Voice AI Systems
Show HN: Nanowakeword – Automates custom wake word model training
github.com·7h·Discuss: Hacker News
🎙️Whisper
From RNNs to ChatGPT: The Paper That Changed How AI Thinks 🤖
dev.to·2h·Discuss: DEV
🏗️AI Infrastructure
Towards a Typology of Strange LLM Chains-of-Thought
lesswrong.com·21h
💻Local LLMs
AI receptionist that answers real phone calls
news.ycombinator.com·1h·Discuss: Hacker News
🧠AI
Making Machines Sound Sarcastic: LLM-Enhanced and Retrieval-Guided Sarcastic Speech Synthesis
arxiv.org·1d
🎤Voice Interfaces
The key to conversational speech recognition
datasciencecentral.com·1d
🎤Voice Interfaces
Detecting and Mitigating Insertion Hallucination in Video-to-Audio Generation
arxiv.org·15h
🎤Voice Interfaces
Own your AI: Learn how to fine-tune Gemma 3 270M and run it on-device
developers.googleblog.com·1d
💻Local LLMs
MuFFIN: Multifaceted Pronunciation Feedback Model with Interactive Hierarchical Neural Modeling
arxiv.org·3d
🎙️Whisper
Prompt Engineering Templates That Work: 7 Copy-Paste Recipes for LLMs
kdnuggets.com·1d
🧩Low-code
Harmonizing AI Voices: Bridging the Gap in Intelligent Communication
dev.to·2d·Discuss: DEV
🎤Voice Interfaces
How Google Translate & ChatGPT Work: The Transformer, Unboxed
dev.to·1d·Discuss: DEV
🎙️Whisper
Everyday AI Agents
oreilly.com·7h
🤖AI agents
RND1: Simple, Scalable AR-to-Diffusion Conversion
radicalnumerics.ai·23h·Discuss: Hacker News
🏗️AI Infrastructure
Can OpenAI build a social network?
maxread.substack.com·2h·Discuss: Substack
🏗️AI Infrastructure
Show HN: NitNab a macOS 26 Privacy Centric AI Transcription App
github.com·4h·Discuss: Hacker News
🗣️Voice Coding